-
Notifications
You must be signed in to change notification settings - Fork 20
Solve issues with mutexes while running tests on MacOS M1 #677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #677 +/- ##
=======================================
Coverage 96.24% 96.24%
=======================================
Files 112 112
Lines 6278 6278
=======================================
Hits 6042 6042
Misses 236 236 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 0.7
git worktree add -d .worktree/backport-677-to-0.7 origin/0.7
cd .worktree/backport-677-to-0.7
git switch --create backport-677-to-0.7
git cherry-pick -x 205f6bc2913e19442801cb78350616cf80919306 |
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 0.8
git worktree add -d .worktree/backport-677-to-0.8 origin/0.8
cd .worktree/backport-677-to-0.8
git switch --create backport-677-to-0.8
git cherry-pick -x 205f6bc2913e19442801cb78350616cf80919306 |
Successfully created backport PR for |
* refactor to use RAII when locking mutexes * capture mutex shared pointer by value to ensure validity at del time * Avoid deadlock involving GIL, and indexGuard * revert change to barrier and guard scoped locks * add small comment (cherry picked from commit 205f6bc)
Solve issues with mutexes while running tests on MacOS M1 (#677) * refactor to use RAII when locking mutexes * capture mutex shared pointer by value to ensure validity at del time * Avoid deadlock involving GIL, and indexGuard * revert change to barrier and guard scoped locks * add small comment (cherry picked from commit 205f6bc) Co-authored-by: Joan Fontanals <jfontanalsmartinez@gmail.com>
Describe the changes in the pull request
2 issues are handled in this PR:
1- An
abort
was identified while running a test related toPyBatchIterator
. The abort happened when the test was being shutdown. The error comes from thedeleter
passed to thePyBatchIterator
shared_ptr
which captures thePyIndex
by reference. The problem is that there is no guarantee that thePyIndex
lifetime will survive theIterator
. In that deleter amutex
is released (which could be in an invalid state). The solution is to have themutex
be wrapped in a shared_ptr that will share the ownership with thePyBatchIterator
deleter as it captures it by value (incrementing its reference count).2- Deadlock found on
tests/flow/test_hnsw_parallel.py::test_parallel_insert_batch_search
:indexGuard
locked in shared mode, and tries to recover the GIL in the getNextResultindexGuard
exclusively, while N-1 has the indexGuard in shared mode. (WAIT)indexGuard
in shared mode (It goes in the queue) but cannot get it in shared mode (I guess is System dependant if they would schedule a reader to bypass the writer ) potential starvation. Thread N+1 has the GIL.This is solved by releasing the GIL in
Thread N+1
before asking forindexGuard
.